{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Pandas-沿时间轴移动数据"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "内容介绍:\n",
    "\n",
    "* 移动指的是沿时间轴将数据前移或后移\n",
    "* 使用shift方法进行移动\n",
    "* Series和DataFrame都有shift方法\n",
    "* 作用：用于执行单纯的前移或后移操作，保持索引不变\n",
    "* 语法如下：\n",
    "\n",
    "shift(self, periods=1, freq=None, axis=0, fill_value=None) -> 'Series'"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "import pandas as pd\n",
    "import datetime\n",
    "import pandas.tseries.offsets as offset"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 1.沿时间轴移动数据"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "2018-01-31    0.435291\n",
       "2018-02-28    1.268614\n",
       "2018-03-31   -0.105448\n",
       "2018-04-30    0.030767\n",
       "Freq: M, dtype: float64"
      ]
     },
     "execution_count": 2,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# 示例数据\n",
    "ts = pd.Series(np.random.randn(4),index=pd.date_range('2018-1-1',periods=4,freq='M'))\n",
    "ts"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "2018-01-31         NaN\n",
       "2018-02-28         NaN\n",
       "2018-03-31    0.435291\n",
       "2018-04-30    1.268614\n",
       "Freq: M, dtype: float64"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "#单纯前后移动\n",
    "#不加其他参数是数据移动，产生缺失数据\n",
    "ts.shift(2)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "2018-01-31   -0.105448\n",
       "2018-02-28    0.030767\n",
       "2018-03-31         NaN\n",
       "2018-04-30         NaN\n",
       "Freq: M, dtype: float64"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "#数据向上移动\n",
    "ts.shift(-2)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**时间序列数据移动的应用：可以计算时间序列的百分比**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "2018-01-31           NaN\n",
       "2018-02-28    191.440209\n",
       "2018-03-31   -108.312083\n",
       "2018-04-30   -129.177534\n",
       "Freq: M, dtype: float64"
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "#计算ts数据的环比增长率\n",
    "(ts/ts.shift(1)-1)*100"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**移动索引，而不是数据的方法:**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "2018-03-31   -0.349219\n",
       "2018-04-30   -0.750028\n",
       "2018-05-31    0.245146\n",
       "2018-06-30    1.704405\n",
       "Freq: M, dtype: float64"
      ]
     },
     "execution_count": 15,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "#设置freq参数后，索引移动，数据不动\n",
    "ts.shift(2,freq='M')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "2018-02-02   -0.349219\n",
       "2018-03-02   -0.750028\n",
       "2018-04-02    0.245146\n",
       "2018-05-02    1.704405\n",
       "dtype: float64"
      ]
     },
     "execution_count": 16,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "#设置freq参数后，索引移动，数据不动\n",
    "ts.shift(2,freq='D')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 2.通过偏移量对日期进行移动\n",
    "\n",
    "需要使用偏移量模块进行处理"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# 需要进一步学习"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 3.时期的数字运算\n",
    "\n",
    "时期表示时间的区间，比如数日、数月或数年等。Period类表示时期数据类型。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Period('2018', 'A-DEC')"
      ]
     },
     "execution_count": 21,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# 创建时期数据对象。'A-DEC'代表每年度12月31日\n",
    "p = pd.Period(2018,freq='A-DEC')\n",
    "p"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Period('2018', 'A-DEC')"
      ]
     },
     "execution_count": 20,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "#也可以传入字符串\n",
    "p2 = pd.Period('2018',freq='A-DEC')\n",
    "p2"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "时期的数字运算\n",
    "\n",
    "* 通过加减整数可以实现对Period的移动\n",
    "* 相同频率的Period对象的差，是他们之间的单位数量"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Period('2020', 'A-DEC')"
      ]
     },
     "execution_count": 22,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "p+2"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 27,
   "metadata": {},
   "outputs": [],
   "source": [
    "p3 = p-2"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "<2 * YearEnds: month=12>"
      ]
     },
     "execution_count": 29,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# 相同的频率可以进行减法\n",
    "p2-p3"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 30,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Help on function period_range in module pandas.core.indexes.period:\n",
      "\n",
      "period_range(start=None, end=None, periods=None, freq=None, name=None) -> pandas.core.indexes.period.PeriodIndex\n",
      "    Return a fixed frequency PeriodIndex.\n",
      "    \n",
      "    The day (calendar) is the default frequency.\n",
      "    \n",
      "    Parameters\n",
      "    ----------\n",
      "    start : str or period-like, default None\n",
      "        Left bound for generating periods.\n",
      "    end : str or period-like, default None\n",
      "        Right bound for generating periods.\n",
      "    periods : int, default None\n",
      "        Number of periods to generate.\n",
      "    freq : str or DateOffset, optional\n",
      "        Frequency alias. By default the freq is taken from `start` or `end`\n",
      "        if those are Period objects. Otherwise, the default is ``\"D\"`` for\n",
      "        daily frequency.\n",
      "    name : str, default None\n",
      "        Name of the resulting PeriodIndex.\n",
      "    \n",
      "    Returns\n",
      "    -------\n",
      "    PeriodIndex\n",
      "    \n",
      "    Notes\n",
      "    -----\n",
      "    Of the three parameters: ``start``, ``end``, and ``periods``, exactly two\n",
      "    must be specified.\n",
      "    \n",
      "    To learn more about the frequency strings, please see `this link\n",
      "    <https://pandas.pydata.org/pandas-docs/stable/user_guide/timeseries.html#offset-aliases>`__.\n",
      "    \n",
      "    Examples\n",
      "    --------\n",
      "    >>> pd.period_range(start='2017-01-01', end='2018-01-01', freq='M')\n",
      "    PeriodIndex(['2017-01', '2017-02', '2017-03', '2017-04', '2017-05', '2017-06',\n",
      "             '2017-07', '2017-08', '2017-09', '2017-10', '2017-11', '2017-12',\n",
      "             '2018-01'],\n",
      "            dtype='period[M]', freq='M')\n",
      "    \n",
      "    If ``start`` or ``end`` are ``Period`` objects, they will be used as anchor\n",
      "    endpoints for a ``PeriodIndex`` with frequency matching that of the\n",
      "    ``period_range`` constructor.\n",
      "    \n",
      "    >>> pd.period_range(start=pd.Period('2017Q1', freq='Q'),\n",
      "    ...                 end=pd.Period('2017Q2', freq='Q'), freq='M')\n",
      "    PeriodIndex(['2017-03', '2017-04', '2017-05', '2017-06'],\n",
      "                dtype='period[M]', freq='M')\n",
      "\n"
     ]
    }
   ],
   "source": [
    "help(pd.period_range)"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.10"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
